mutual exclusivity
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- North America > United States > California (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Mutual exclusivity as a challenge for deep neural networks
Strong inductive biases allow children to learn in fast and adaptable ways. Children use the mutual exclusivity (ME) bias to help disambiguate how words map to referents, assuming that if an object has one label then it does not need another. In this paper, we investigate whether vanilla neural architectures have an ME bias, demonstrating that they lack this learning assumption. Moreover, we show that their inductive biases are poorly matched to lifelong learning formulations of classification and translation. Together, these results make a compelling case for designing task-general neural networks that learn through mutual exclusivity, which remains an open challenge.
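The measurement behind this line of work can be illustrated with a toy sketch (a hypothetical setup, not the authors' code): train a small softmax classifier on a handful of seen labels, then check how much probability mass it assigns to never-used labels when shown a novel input. An ME-biased learner would concentrate that mass near one; a vanilla classifier does not.

```python
import numpy as np

n_items, n_labels = 3, 4            # items 0-1 are trained; item 2 is novel
X = np.eye(n_items)[[0, 1]]         # one-hot inputs for the two seen items
y = np.array([0, 1])                # their labels; labels 2 and 3 are never used

W = np.zeros((n_items, n_labels))   # a minimal linear softmax classifier

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# plain gradient descent on the cross-entropy loss
for _ in range(500):
    p = softmax(X @ W)
    p[np.arange(len(y)), y] -= 1.0  # softmax gradient: p - one_hot(y)
    W -= 0.5 * (X.T @ p) / len(y)

novel = np.eye(n_items)[2]          # an input the model has never seen
p_novel = softmax(novel @ W)
me_score = p_novel[2] + p_novel[3]  # mass reserved for the never-used labels
print(f"ME score on novel input: {me_score:.2f}")  # 0.50, not close to 1
```

With one-hot inputs, the weights feeding the novel item are never updated, so the classifier spreads probability uniformly over all four labels rather than reserving it for the unused ones, i.e. it shows no ME bias.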
Review for NeurIPS paper: Mutual exclusivity as a challenge for deep neural networks
The experiments assume that a model exhibits the bias if the probability of the new class(es) given a new word is one. It is not clear why the models should be expected to assign a probability of one to the correct (new) class. When testing classification models, the correct class is the one to which the model assigns the highest probability, and this probability is often much smaller than one. This is because the prior probability summed over all incorrect classes is relatively large when there are many classes, even though the probability of each individual class is small. Moreover, when testing the models in a continual learning setup, the authors should stop training before the model overfits, and report the performance of the models on a held-out split.
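The reviewer's point about probability mass can be made concrete with a toy calculation (illustrative numbers only, not taken from the paper): with many classes, the correct class can receive the highest probability while still being far from one, because the small probabilities of the many incorrect classes add up.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

n_classes = 1000
logits = np.zeros(n_classes)
logits[0] = 3.0              # the correct class gets the largest logit

p = softmax(logits)
print(p.argmax())            # 0 -- the model picks the right class
print(f"{p[0]:.3f}")         # ~0.02, nowhere near 1
print(f"{1 - p[0]:.3f}")     # the incorrect classes jointly hold most of the mass
```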
Review for NeurIPS paper: Mutual exclusivity as a challenge for deep neural networks
The paper received mixed reviews from four reviewers. All reviewers generally agree the paper is interesting and exposes a promising research direction: a bias that comes naturally to humans but is lacking in most modern machine learning systems. The main concerns raised by the reviewers are the use of synthetic data and the absence of a concrete proposal for how to incorporate mutual exclusivity into a model as an inductive bias. The AC believes the use of synthetic data is not sufficient reason for rejection, since machine learning systems ultimately need to work in such cases as well. Several other minor concerns are also raised (architecture search, the fact that no other model has ME), but these are minor and not weaknesses directly related to the contribution.
Neural DNF-MT: A Neuro-symbolic Approach for Learning Interpretable and Editable Policies
Baugh, Kexin Gu, Dickens, Luke, Russo, Alessandra
Although deep reinforcement learning has been shown to be effective, the black-box nature of its models presents barriers to direct policy interpretation. To address this problem, we propose a neuro-symbolic approach called neural DNF-MT for end-to-end policy learning. The differentiable nature of the neural DNF-MT model enables the use of deep actor-critic algorithms for training. At the same time, its architecture is designed so that trained models can be directly translated into interpretable policies expressed as standard (bivalent or probabilistic) logic programs. Moreover, additional layers can be included to extract abstract features from complex observations, acting as a form of predicate invention. The logic representations are highly interpretable, and we show how the bivalent representations of deterministic policies can be edited and incorporated back into a neural model, facilitating manual intervention and adaptation of learned policies. We evaluate our approach on a range of tasks requiring learning deterministic or stochastic behaviours from various forms of observations. Our empirical results show that our neural DNF-MT model performs at the level of competing black-box methods whilst providing interpretable policies.
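The abstract's central idea, a differentiable model that can be read off as a logic program, can be illustrated with a generic soft-DNF sketch. This is a simplification using the product t-norm as a stand-in, not the paper's actual neural DNF-MT layer: conjunction and disjunction are replaced by differentiable surrogates, so a DNF formula can be evaluated (and, in principle, trained) with gradients, while near-crisp inputs recover ordinary Boolean behaviour.

```python
import math

def soft_and(probs):
    # product t-norm: a differentiable surrogate for conjunction
    return math.prod(probs)

def soft_or(probs):
    # probabilistic sum: a differentiable surrogate for disjunction
    return 1.0 - math.prod(1.0 - p for p in probs)

# soft evaluation of the DNF rule (a AND NOT b) OR (b AND c)
def soft_dnf(a, b, c):
    clause1 = soft_and([a, 1.0 - b])
    clause2 = soft_and([b, c])
    return soft_or([clause1, clause2])

# near-crisp inputs recover the Boolean truth table...
print(round(soft_dnf(1.0, 0.0, 0.0)))  # 1 (first clause fires)
print(round(soft_dnf(0.0, 1.0, 1.0)))  # 1 (second clause fires)
print(round(soft_dnf(0.0, 1.0, 0.0)))  # 0 (no clause fires)
# ...while soft inputs give graded, differentiable outputs
print(soft_dnf(0.9, 0.1, 0.5))
```

Thresholding the soft truth values of a trained model of this kind yields an ordinary bivalent DNF rule, which is the sense in which such policies can be inspected and edited by hand.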
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (8 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
A Bayesian Framework for Cross-Situational Word-Learning
For infants, early word learning is a chicken-and-egg problem. One way to learn a word is to observe that it co-occurs with a particular referent across different situations. Another way is to use the social context of an utterance to infer the intended referent of a word. Here we present a Bayesian model of cross-situational word learning, and an extension of this model that also learns which social cues are relevant to determining reference. We test our model on a small corpus of mother-infant interaction and find it performs better than competing models. Finally, we show that our model accounts for experimental phenomena including mutual exclusivity, fast-mapping, and generalization from social cues.
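The cross-situational route described in the abstract can be sketched with a minimal co-occurrence learner (a deliberately simplified stand-in for the paper's Bayesian model, with invented example data): the referent that consistently appears alongside a word across situations accumulates the most evidence, even though each single situation is ambiguous.

```python
from collections import Counter, defaultdict

# each situation pairs the words heard with the objects present
situations = [
    (["ball", "dog"], ["BALL", "DOG"]),
    (["ball", "cup"], ["BALL", "CUP"]),
    (["dog", "cup"],  ["DOG", "CUP"]),
]

# count word-referent co-occurrences across all situations
cooc = defaultdict(Counter)
for words, objects in situations:
    for w in words:
        for o in objects:
            cooc[w][o] += 1

# for each word, the referent with the highest count wins
lexicon = {w: counts.most_common(1)[0][0] for w, counts in cooc.items()}
print(lexicon)  # {'ball': 'BALL', 'dog': 'DOG', 'cup': 'CUP'}
```

Each word co-occurs with its true referent in two situations but with every distractor only once, so simple counting disambiguates the mapping; the Bayesian model replaces these counts with a posterior over lexicons.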
The advent and fall of a vocabulary learning bias from communicative efficiency
Carrera-Casado, David, Ferrer-i-Cancho, Ramon
Biosemiosis is a process of choice-making among simultaneously available alternative options. It is well known that, when sufficiently young children encounter a new word, they tend to interpret it as pointing to a meaning that does not yet have a word in their lexicon rather than to a meaning that already has a word attached. In previous research, this strategy was shown to be optimal from an information-theoretic standpoint. In that framework, interpretation is hypothesized to be driven by the minimization of a cost function: the option of least communication cost is chosen. However, the information-theoretic model employed in that research neither explains the weakening of this vocabulary learning bias in older children or polylinguals nor reproduces Zipf's meaning-frequency law, namely the non-linear relationship between the number of meanings of a word and its frequency. Here we consider a generalization of the model that is designed to reproduce that law. The analysis of the new model reveals regions of the phase space where the bias disappears, consistent with the weakening or loss of the bias in older children or polylinguals. The model is abstract enough to support future research on other levels of life that are relevant to biosemiotics. In the deep learning era, the model is a transparent low-dimensional tool for future experimental research, and it illustrates the predictive power of a theoretical framework originally designed to shed light on the origins of Zipf's rank-frequency law.
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- (9 more...)
- Research Report > Experimental Study (0.47)
- Research Report > New Finding (0.34)
- Health & Medicine (0.67)
- Education (0.45)